-
Notifications
You must be signed in to change notification settings - Fork 3.5k
mapfs backend, using a manifest to map contained files to paths #23808
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks promising! Some initial comments.
src/lib/libmapfs.js
Outdated
| $MAPFS__deps: ['$stringToUTF8OnStack', 'wasmfs_create_map_backend', 'wasmfs_map_create_manifest', 'wasmfs_map_add_to_manifest'], | ||
| $MAPFS: { | ||
| createBackend(opts) { | ||
| return withStackSave(() => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The withStackSave can be on the _wasmfs_map_add_to_manifest call. That will reuse the stack each time, allowing a lower stack size to work in more cases.
| // University of Illinois/NCSA Open Source License. Both these licenses can be | ||
| // found in the LICENSE file. | ||
|
|
||
| // This file defines the fetch backend. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // This file defines the fetch backend. | |
| // This file defines the MAPFS backend. |
|
|
||
| namespace wasmfs { | ||
|
|
||
| typedef std::map<std::string,std::string> MapManifest; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| typedef std::map<std::string,std::string> MapManifest; | |
| using MapManifest = std::map<std::string,std::string>; |
(I believe this is more "modern" notation)
|
|
||
| typedef std::map<std::string,std::string> MapManifest; | ||
|
|
||
| class MapBackend : public wasmfs::Backend { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| class MapBackend : public wasmfs::Backend { | |
| class MapBackend : public Backend { |
(already in the namespace)
| assert(manifest && "Mapfs: null manifest not supported"); | ||
| } | ||
| ~MapBackend() { | ||
| if(manifest != NULL) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| if(manifest != NULL) { | |
| if (manifest != NULL) { |
| std::shared_ptr<DataFile> createFile(mode_t mode) override; | ||
| std::shared_ptr<Directory> createDirectory(mode_t mode) override; | ||
| std::shared_ptr<Symlink> createSymlink(std::string target) override { | ||
| fprintf(stderr, "mapfs doesn't support creating symlinks"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| fprintf(stderr, "mapfs doesn't support creating symlinks"); | |
| fprintf(stderr, "MAPFS doesn't support creating symlinks"); |
I think this is more consistent with other text?
|
|
||
|
|
||
| class MapDirectory : public MemoryDirectory { | ||
| std::string dirPath; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comment here. I assume this is either the "true" or the "mapped" path? Also dirPath is perhaps generic - how about truePath or mappedPath?
|
Thanks! I've addressed these comments and found another place the lowercase mapfs was used instead of MAPFS. |
|
When a file is mapped (e.g. |
A "virtual data file" is created, as in the icase backend, which refers to the original file. It is different from icase in that the original file can also be accessed by some path. What should happen/what bad things could happen/what test cases should be added to explore the possibility of the file being opened under two different names? |
Would symlinks be a simpler/quicker solution? Should we add hard link support maybe? |
|
We thought about symlinks before, but this will work even if the C code doesn't do the right thing wrt symlinks. One could argue that such C code should be fixed instead of having a new mapping backend, but it becomes complicated with e.g. whether writes through remapping should modify the location of the symlink or the underlying target location (perhaps not relevant for the fetch backend but maybe relevant in other settings). |
Fair enough, symlink trees are not the same as normal files. How about hard links? @tlively @kripken did you even think of adding support for the same file existing in multiple places? (or more than one file having the same inode number?) |
|
No, I don't think we ever seriously considered supporting hard links. Backends like OPFS would not be able to support them because they could not be reflected in the underlying storage. I don't have enough of the system paged into memory to say what other specific problems might arise from allowing files to be reachable from multiple paths, but I am very wary of the idea. Does this backend make the same file visible from multiple paths, or does it virtualize and provide a mapped view into some underlying backend that is not directly mounted into the file system? I couldn't tell from a quick look at the code. I would strongly prefer the latter design to avoid making a file directly accessible from multiple paths. |
|
This does the former, but I think it’s a small change to make it do the latter:
This would make it more like the icase backend, so I’ll give that a shot over the next day or two. |
|
So I think it's not actually going to be possible with the current API to define mapfs as a front to another, non-mounted filesystem, especially if the wasmfs_create_file API is going away, because intermediate directories &c may need to be created in the non-mounted filesystem. The backend would need specialized API that takes both a real path on the wasmfs filesystem and a virtual path on the mapped-to filesystem backend. Regular open(O_CREAT)/mkdir type calls won't work to create files in mapfs without a much more sophisticated mapping than a simple manifest. Not every filesystem backend allows writes, and already "creating" a file in the fetch backend (for example) is an awkward idea (since files are readonly)---moreover, doMkdir and doOpen also "mount" the created files on the filesystem, so the syscalls both create and mount. So I believe the best solution is to withdraw mapfs and require that C programs handle symlinks properly, and also make fetchfs automatically create files on open (even without O_CREAT)---but even that might not actually be possible, since getChild() can't know if it's getting a data file, directory, or something else. So as I see it, the manifest makes sense as part of fetchfs (mapping filesystem paths and URLs, designating actually accessible files) but has big question marks when implemented as a standalone filesystem. So I think a standalone mapfs has too many rough edges and shouldn't be implemented; either fetchfs should use a manifest (and avoid the need for "creating"/opening fetched files) or we should just require that C programs do the right thing with symlinks. |
|
Thanks for that analysis, @JoeOsborn. It sounds right to me. I would be fine with adding manifests to fetchfs if there is demand for it, but I would also hope that most programs handle symlinks correctly. |
|
I’ll close this as a successful experiment. |
I got obsessed with this problem (discussed in #23668 , #23021) and implemented the mapfs manifest-mapped filesystem idea. From the header:
Mapping backend
Creates a new path mapping backend based on a provided
manifest,which is a mapping from file paths (relative to where the mapfs is
mounted) to absolute file paths This is useful if your C program
expects files at certain locations, but might not handle symlinks
properly. It can also be combined with e.g. fetchfs, in case your
web server has files at several different paths but the C program
wants to see all the files within one directory. The manifest can
be built using wasmfs_map_create_manifest and
wasmfs_map_add_to_manifest. When the mapfs is mounted, directories
and files corresponding to the manifest entries will be
automatically created. Importantly, only individual files can be
mapped through mapfs, not whole directories---a manifest can't map
one directory to another, just one file to another.
For example, a manifest like
{'test.txt':'/resources/file.txt'}mounted at
/dat(using either wasmfs_mount orwasmfs_create_directory) will let you write
open("/dat/test.txt", O_RDONLY).If you have a fetchfs at
/fetch/pointed to server.com and map files from/path/whateverto/fetch/url/path/file, your C code can access/path/whateverin order to lazily fetch the file atserver.com/url/path/file. I can also see it being useful for OPFS, in case the OPFS file hierarchy is not what your C code wants for whatever reason.